System and method for enabling micro-partitioning in a multi-threaded processor

ABSTRACT

A system and method for allowing jobs originating from different partitions to simultaneously utilize different hardware threads on a processor by concatenating partition identifiers with virtual page identifiers within a processor's translation lookaside buffer is presented. The device includes a translation lookaside buffer that translates concatenated virtual addresses to system-wide real addresses. The device generates concatenated virtual addresses using a partition identifier, which corresponds to a job's originating partition, and a virtual page identifier, which corresponds to the executing instruction, such as an instruction address or data address. In turn, each concatenated virtual address is different, which translates in the translation lookaside buffer to a unique system-wide real address. As such, jobs originating from different partitions are able to simultaneously execute on the device and, therefore, fully utilize each of the device's hardware threads.

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to a system and method for enabling micro-partitioning in a multi-threaded processor. More particularly, the present invention relates to a system and method for permitting different partitions to simultaneously utilize a processor's different hardware threads by concatenating partition identifiers with virtual page identifiers within the processor's translation lookaside buffer.

2. Description of the Related Art

Today's processors include multiple hardware threads for simultaneously executing tasks. In addition, processors dynamically reconfigure their resources into “partitions” using a shared resource pool. These partitions invoke jobs (processes) that, in turn, execute on one of the hardware threads.

A challenge found, however, is that today's processors do not allow different partitions to simultaneously utilize different hardware threads. At any given time, only jobs originating from one partition may execute on multiple threads. For example, partition A may invoke jobs 1, 2, and 3 that may simultaneously execute on hardware threads X, Y, and Z. However, jobs originating from different partitions (e.g., partition A, partition B, partition C) are not able to simultaneously execute on different hardware threads due to existing address translation limitations. As such, a processor's multi-threaded capability is wasted when a particular partition does not utilize all of a processor's hardware threads.

What is needed, therefore, is a system and method for enabling jobs originating from different partitions to simultaneously execute on a multiple hardware thread processor.

SUMMARY

It has been discovered that the aforementioned challenges are resolved using a system and method for allowing jobs originating from different partitions to simultaneously utilize different hardware threads on a processor by concatenating partition identifiers with virtual page identifiers, which results in a concatenated virtual address that the processor translates to a system-wide real address using a translation lookaside buffer.

A device includes multiple hardware threads and multiple partitions. Each partition comprises a subset of the device's resources that are part of a shared resource pool, which the device virtualizes and utilizes as separate entities. Each partition invokes jobs, or processes, which the device queues in a job queue for execution by one of the hardware threads.

In order to effectively process address translation requests from jobs that originate from different partitions, the device includes a translation lookaside buffer that translates concatenated virtual addresses to system-wide real addresses. The device generates concatenated virtual addresses using a partition identifier, which corresponds to a job's originating partition, and a virtual page identifier, which corresponds to the executing instruction, such as an instruction address or data address. In turn, each concatenated virtual address is different, which translates in the translation lookaside buffer to a different system-wide real address. As such, jobs originating from different partitions are able to simultaneously execute on the device and, therefore, fully utilize each of the device's hardware threads.

The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 is a diagram showing a device simultaneously processing jobs that originate from two different partitions;

FIG. 2A is a diagram showing two hardware threads executing different jobs originating from the same partition;

FIG. 2B is a diagram showing two jobs, which originate from different partitions, simultaneously executing on two different hardware threads that reside on a single device;

FIG. 2C is a diagram showing two jobs, which originate from different partitions, simultaneously executing on two different hardware threads that reside on a single device;

FIG. 3 is a diagram showing a translation lookaside buffer that includes concatenated virtual addresses and corresponding system-wide real addresses;

FIG. 4 is a high-level flowchart showing steps taken in simultaneously executing jobs that originate from different partitions;

FIG. 5 is a flowchart showing steps taken in processing a job using a hardware thread;

FIG. 6 is a flowchart showing steps taken in translating a virtual page identifier to a system-wide real address by concatenating the virtual page identifier with a partition identifier; and

FIG. 7 is a block diagram of a computing device capable of implementing the present invention.

DETAILED DESCRIPTION

The following is intended to provide a detailed description of an example of the invention and should not be taken to be limiting of the invention itself. Rather, any number of variations may fall within the scope of the invention, which is defined in the claims following the description.

FIG. 1 is a diagram showing a device simultaneously processing jobs that originate from two different partitions using multiple hardware threads. Device 100 includes partitions 1 100 through 4 115. Each partition comprises a subset of device 100's resources, which are part of a shared resource pool, that device 100 virtualizes and uses as separate entities. Each partition invokes jobs, or processes, which are queued in job queue 120 for execution by hardware thread A 125 or hardware thread B 130.

Both hardware thread A 125 and hardware thread B 130 share translation lookaside buffer (TLB) 140. TLB 140 includes a table with concatenated virtual addresses and corresponding system-wide real addresses. The concatenated virtual addresses are generated by concatenating a job's partition identifier and a virtual page identifier. In turn, each concatenated virtual address is different, which translates to a different system-wide real address. As such, jobs originating from different partitions are able to simultaneously execute on device 100 using hardware thread A 125 and hardware thread B 130 (see FIG. 3 and corresponding text for further details).

FIG. 1 shows that job queue 120 includes two jobs from partition 1 (P1 Job A 150 and P1 Job B 155), one job from partition 2 105 (P2 Job A 160), one job from partition 3 110 (P3 Job A 170), and two jobs from partition 4 115 (P4 Job A 180 and P4 Job B 185). As the jobs work their way down job queue 120, device 100 loads each job in either hardware thread A 125 or hardware thread B 130, whichever thread is available. Since TLB 140 includes concatenated virtual addresses, hardware threads A 125 and B 130 are able to use TLB 140 to effectively translate virtual page identifiers to system-wide real addresses by using a currently executing job's corresponding partition identifier (see FIGS. 2A-2C, 3, and corresponding text for further details). In one embodiment, the invention described herein operates in a multi-processor environment. In this embodiment, a single processor processes job requests originating from different partitions using its translation lookaside buffer and hardware threads even though the different partitions may reside on different processors.

FIG. 2A is a diagram showing two hardware threads executing different jobs that originate from the same partition. Referring back to FIG. 1, P1 job A 150 and P1 job B 155 were the first two jobs in job queue 120 ready for execution. FIG. 2A shows that P1 job A 150 loads into hardware thread A 125 and P1 job B 155 loads into hardware thread B 130. In turn, the remaining jobs included in job queue 120 move closer to the front of job queue 120. When either P1 Job A 150 or P1 Job B 155 finishes, the next job in the queue (P2 Job A 160) loads into the available hardware thread and commences execution, regardless of whether the other hardware thread is still executing a job from the first partition (see FIG. 2B and corresponding text for further details).

FIG. 2B is a diagram showing two jobs, which originate from different partitions, simultaneously executing on two different hardware threads that reside on a single device. Referring back to FIG. 2A, FIG. 2B shows that P1 job A 150 completes and P2 job A 160 loads onto hardware thread A 125 for execution. At this point, hardware thread A 125 and hardware thread B 130 contain jobs that originate from different partitions. Although the jobs originate from different partitions, the invention described herein allows simultaneous execution due to the fact that the translation lookaside buffer (TLB) that hardware thread A 125 and hardware thread B 130 both access contains concatenated virtual addresses that include a partition identifier that identifies a particular partition. As a result, when a job originating from a partition requests a real address from the TLB, the TLB is able to provide a correct system-wide real address (see FIG. 3 and corresponding text for further details).

FIG. 2C is a diagram showing two jobs, which originate from different partitions, simultaneously executing on two different hardware threads that reside on a single device. Referring back to FIG. 2B, FIG. 2C shows that P1 job B 155 completes and P3 job A 170 loads into hardware thread B 130. Again, the invention described here allows hardware thread A 125 to execute P2 job A 160 at the same time that hardware thread B 130 executes P3 job A 170 even though they originate from different partitions.
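The progression through FIGS. 2A-2C can be illustrated with a minimal software sketch. It is not the device's actual dispatch logic: the names `job_queue`, `threads`, `fill_idle_threads`, and `complete` are hypothetical, and the model assumes only that an idle hardware thread takes the job at the front of the queue.

```python
from collections import deque

# Job queue 120 in the order described for FIG. 1 (labels are illustrative).
job_queue = deque([
    "P1 Job A", "P1 Job B", "P2 Job A",
    "P3 Job A", "P4 Job A", "P4 Job B",
])

# Hardware threads A 125 and B 130; None means the thread is idle.
threads = {"A": None, "B": None}


def fill_idle_threads():
    """Load the next queued job into every idle hardware thread."""
    for name in threads:
        if threads[name] is None and job_queue:
            threads[name] = job_queue.popleft()


def complete(name):
    """Mark the job on the named thread as finished, then refill the thread."""
    threads[name] = None
    fill_idle_threads()


fill_idle_threads()
state_2a = dict(threads)   # both threads run partition 1 jobs (FIG. 2A)

complete("A")              # P1 Job A finishes
state_2b = dict(threads)   # thread A now runs a partition 2 job (FIG. 2B)

complete("B")              # P1 Job B finishes
state_2c = dict(threads)   # threads run partition 2 and partition 3 jobs (FIG. 2C)
```

In this sketch the originating partition plays no role in deciding which thread a job may use, which mirrors the behavior the figures describe.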

FIG. 3 is a diagram showing a translation lookaside buffer that includes concatenated virtual addresses and corresponding system-wide real addresses. TLB 140, which is the same as that shown in FIG. 1, includes columns 310 and 320. Column 310 includes concatenated virtual addresses, which are addresses that are generated using a partition identifier and a virtual page identifier. As such, the invention described herein allows jobs originating from multiple partitions to utilize TLB 140 due to the fact that each entry within column 310 is different even though many of the entries may be based upon the same virtual page identifier. For example, rows 330 and 340 include the same virtual page identifier (VPID1), but row 330 corresponds to a first partition (partition 1 identifier) and row 340 corresponds to a second partition (partition 2 identifier). In turn, when the first partition sends a request to access memory using virtual page identifier “V1,” processing concatenates partition 1 identifier with virtual page identifier V1 and retrieves real address “Ra” from row 330. Likewise, when the second partition sends a request to access memory using virtual page identifier “V1,” processing concatenates partition 2 identifier with virtual page identifier V1 and retrieves real address “Rb” from row 340.

Rows 350 and 360 show a similar situation using virtual page identifier “V2.” When the first partition sends a request to access memory using virtual page identifier “V2,” processing concatenates partition 1 identifier with virtual page identifier V2 and retrieves real address “Rx” from row 350. Likewise, when the second partition sends a request to access memory using virtual page identifier “V2,” processing concatenates partition 2 identifier with virtual page identifier V2 and retrieves real address “Ry” from row 360.
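The four rows discussed above can be modeled as a small lookup table. The following is a sketch only: the dictionary `tlb_140`, the function `lookup`, and the string labels are hypothetical, and a Python tuple stands in for the concatenated virtual address.

```python
# Column 310 (concatenated virtual address) maps to column 320 (real address).
# A (partition identifier, virtual page identifier) tuple stands in for the
# concatenation; this representation is an assumption made for illustration.
tlb_140 = {
    ("P1", "V1"): "Ra",  # row 330
    ("P2", "V1"): "Rb",  # row 340
    ("P1", "V2"): "Rx",  # row 350
    ("P2", "V2"): "Ry",  # row 360
}


def lookup(partition_id, virtual_page_id):
    """Return the system-wide real address for a partition's virtual page."""
    return tlb_140[(partition_id, virtual_page_id)]


# The same virtual page identifier resolves differently for each partition.
first = lookup("P1", "V1")
second = lookup("P2", "V1")
```

Because the partition identifier is part of the key, two partitions that use the same virtual page identifier never collide on a single entry.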

FIG. 4 is a high-level flowchart showing steps taken in simultaneously executing jobs that originate from different partitions. Processing commences at 400, whereupon processing waits for a job request from one of partitions 415 at step 410, such as partitions 1 100 through 4 115 shown in FIG. 1. When processing receives a job request, processing loads the job in job queue 120 (step 420), which queues the job for execution. Job queue 120 is the same as that shown in FIG. 1.

A determination is made as to whether a hardware thread is available (decision 430). If a hardware thread is not available, decision 430 branches to “No” branch 432, which loops back to receive more job requests from partitions 415. This looping continues until a hardware thread is available, at which point decision 430 branches to “Yes” branch 438 whereupon processing loads a job that is next in line in job queue 120 into hardware thread 450, which is the available hardware thread.

Processing executes the job using hardware thread 450 independent of other executing jobs that originate from different partitions by using a translation lookaside buffer that translates concatenated virtual addresses to system-wide real addresses (pre-defined process block 460, see FIG. 5 and corresponding text for further details).

A determination is made as to whether to continue processing (decision 470). If processing should continue, decision 470 branches to “Yes” branch 472, which loops back to receive and process more job requests. This looping continues until processing should terminate, at which point decision 470 branches to “No” branch 478 whereupon processing ends at 480.

FIG. 5 is a flowchart showing steps taken in processing a job using a hardware thread. Job processing commences at 500, whereupon processing reads a program counter for the job's next instruction address (step 505). The next instruction address corresponds to a virtual page identifier, which processing concatenates with the job's corresponding partition identifier and translates into a system-wide real address using TLB 140 (pre-defined process block 510, see FIG. 6 and corresponding text for further details).

At step 515, processing fetches the instruction using the system-wide real address. A determination is made as to whether the fetched instruction is a memory instruction, such as a load or store instruction (decision 520). If the instruction is not a memory instruction, decision 520 branches to “No” branch 528 whereupon processing processes the instruction at step 540. On the other hand, if the instruction is a memory instruction, decision 520 branches to “Yes” branch 522 whereupon processing translates the memory location into a system-wide real address using the job's corresponding partition identifier and the virtual page identifier that corresponds to the memory location (pre-defined process block 530, see FIG. 6 and corresponding text for further details). Processing then executes the instruction using the system-wide real address at step 540.

A determination is made as to whether the job is complete (decision 550). If the job is not complete, decision 550 branches to “No” branch 552 whereupon processing loops back to process the next instruction. This looping continues until the job is complete, at which point decision 550 branches to “Yes” branch 558 whereupon processing returns at 560.
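The per-instruction flow of FIG. 5 can be sketched in software as follows. This is a simplified model under stated assumptions rather than the hardware behavior: `run_job`, `instruction_pages`, `memory`, and the instruction encoding `(opcode, operand_page)` are all hypothetical, and the TLB is again modeled as a dictionary keyed by partition identifier and virtual page identifier.

```python
def run_job(partition_id, instruction_pages, memory, tlb):
    """Walk a job's instructions, translating addresses as in FIG. 5.

    instruction_pages: virtual page identifiers read from the program
        counter, one per instruction (an illustrative simplification).
    memory: maps a real address to an (opcode, operand_page) pair.
    tlb: maps (partition_id, virtual_page_id) to a real address.
    Returns a trace of (opcode, data_real_address_or_None) tuples.
    """
    trace = []
    for virtual_page in instruction_pages:               # step 505
        fetch_address = tlb[(partition_id, virtual_page)]    # block 510
        opcode, operand_page = memory[fetch_address]     # step 515
        if opcode in ("load", "store"):                  # decision 520
            data_address = tlb[(partition_id, operand_page)]  # block 530
            trace.append((opcode, data_address))         # step 540
        else:
            trace.append((opcode, None))                 # step 540
    return trace                                         # return at 560


# Hypothetical contents: both partitions use virtual pages V1 and V2.
example_tlb = {
    (1, "V1"): "Ra", (2, "V1"): "Rb",
    (1, "V2"): "Rx", (2, "V2"): "Ry",
}
example_memory = {
    "Ra": ("load", "V2"),  # partition 1's instruction at V1 is a load
    "Rb": ("add", None),   # partition 2's instruction at V1 is not a memory op
}

trace_p1 = run_job(1, ["V1"], example_memory, example_tlb)
trace_p2 = run_job(2, ["V1"], example_memory, example_tlb)
```

Both jobs present the same virtual page identifier, yet each fetches its own instruction, because the partition identifier participates in every translation.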

FIG. 6 is a flowchart showing steps taken in translating a virtual page identifier to a system-wide real address by concatenating the virtual page identifier with a partition identifier. Processing commences at 600, whereupon processing extracts the virtual page identifier from a translation request that is received from a job that is currently executing (see FIG. 5 and corresponding text for further details). At step 630, processing identifies a partition identifier that corresponds to the partition that invoked the executing job. Next, processing concatenates the partition identifier with the virtual page identifier in order to generate a unique concatenated virtual address (step 640).

Using the concatenated virtual address, processing looks up a system-wide real address in TLB 140 at step 650. Since processing uses concatenated virtual addresses, multiple jobs originating from different partitions may be simultaneously executed because each concatenated virtual address corresponds to a single system-wide real address (see FIG. 3 and corresponding text for further details). At step 660, processing provides the system-wide real address to job 670 and returns at 680.
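One way to picture the concatenation of step 640 is as bit packing. The sketch below assumes a 20-bit virtual page identifier and places the partition identifier in the higher-order bits; the description does not specify field widths or ordering, so `VPID_BITS`, `concatenate`, and `translate` are hypothetical choices made for illustration.

```python
VPID_BITS = 20  # assumed width of the virtual page identifier field


def concatenate(partition_id, virtual_page_id):
    """Form a concatenated virtual address (step 640).

    The partition identifier occupies the bits above the virtual page
    identifier; this layout is an assumption, not taken from the source.
    """
    if not 0 <= virtual_page_id < (1 << VPID_BITS):
        raise ValueError("virtual page identifier does not fit in VPID_BITS")
    if partition_id < 0:
        raise ValueError("partition identifier must be non-negative")
    return (partition_id << VPID_BITS) | virtual_page_id


def translate(tlb, partition_id, virtual_page_id):
    """Look up the system-wide real address (steps 630 through 660).

    Handling of a missing entry is not described in the source; here a
    KeyError simply propagates to the caller.
    """
    return tlb[concatenate(partition_id, virtual_page_id)]


# Two partitions share virtual page 0x5 but map to different real addresses.
sample_tlb = {
    concatenate(1, 0x5): 0xA000,
    concatenate(2, 0x5): 0xB000,
}
real_for_p1 = translate(sample_tlb, 1, 0x5)
real_for_p2 = translate(sample_tlb, 2, 0x5)
```

Bounding the virtual page identifier to a fixed width is what keeps the packed values distinct: two different (partition, page) pairs cannot produce the same integer.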

FIG. 7 illustrates information handling system 701 which is a simplified example of a computer system capable of performing the computing operations described herein. Computer system 701 includes processor 700 which is coupled to host bus 702. A level two (L2) cache memory 704 is also coupled to host bus 702. Host-to-PCI bridge 706 is coupled to main memory 708, includes cache memory and main memory control functions, and provides bus control to handle transfers among PCI bus 710, processor 700, L2 cache 704, main memory 708, and host bus 702. Main memory 708 is coupled to Host-to-PCI bridge 706 as well as host bus 702. Devices used solely by host processor(s) 700, such as LAN card 730, are coupled to PCI bus 710. Service Processor Interface and ISA Access Pass-through 712 provides an interface between PCI bus 710 and PCI bus 714. In this manner, PCI bus 714 is insulated from PCI bus 710. Devices, such as flash memory 718, are coupled to PCI bus 714. In one implementation, flash memory 718 includes BIOS code that incorporates the necessary processor executable code for a variety of low-level system functions and system boot functions.

PCI bus 714 provides an interface for a variety of devices that are shared by host processor(s) 700 and Service Processor 716 including, for example, flash memory 718. PCI-to-ISA bridge 735 provides bus control to handle transfers between PCI bus 714 and ISA bus 740, universal serial bus (USB) functionality 745, power management functionality 755, and can include other functional elements not shown, such as a real-time clock (RTC), DMA control, interrupt support, and system management bus support. Nonvolatile RAM 720 is attached to ISA Bus 740. Service Processor 716 includes JTAG and I2C busses 722 for communication with processor(s) 700 during initialization steps. JTAG/I2C busses 722 are also coupled to L2 cache 704, Host-to-PCI bridge 706, and main memory 708 providing a communications path between the processor, the Service Processor, the L2 cache, the Host-to-PCI bridge, and the main memory. Service Processor 716 also has access to system power resources for powering down information handling device 701.

Peripheral devices and input/output (I/O) devices can be attached to various interfaces (e.g., parallel interface 762, serial interface 764, keyboard interface 768, and mouse interface 770) coupled to ISA bus 740. Alternatively, many I/O devices can be accommodated by a super I/O controller (not shown) attached to ISA bus 740.

In order to attach computer system 701 to another computer system to copy files over a network, LAN card 730 is coupled to PCI bus 710. Similarly, to connect computer system 701 to an ISP to connect to the Internet using a telephone line connection, modem 775 is connected to serial port 764 and PCI-to-ISA Bridge 735.

While FIG. 7 shows one information handling system that employs processor(s) 700, the information handling system may take many forms. For example, information handling system 701 may take the form of a desktop, server, portable, laptop, notebook, or other form factor computer or data processing system. Information handling system 701 may also take other form factors such as a personal digital assistant (PDA), a gaming device, ATM machine, a portable telephone device, a communication device or other devices that include a processor and memory.

One of the preferred implementations of the invention is a client application, namely, a set of instructions (program code) in a code module that may, for example, be resident in the random access memory of the computer. Until required by the computer, the set of instructions may be stored in another computer memory, for example, in a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy disk (for eventual use in a floppy disk drive). Thus, the present invention may be implemented as a computer program product for use in a computer. In addition, although the various methods described are conveniently implemented in a general purpose computer selectively activated or reconfigured by software, one of ordinary skill in the art would also recognize that such methods may be carried out in hardware, in firmware, or in more specialized apparatus constructed to perform the required method steps.

While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects. Therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention. Furthermore, it is to be understood that the invention is solely defined by the appended claims. It will be understood by those with skill in the art that if a specific number of an introduced claim element is intended, such intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present. For non-limiting example, as an aid to understanding, the following appended claims contain usage of the introductory phrases “at least one” and “one or more” to introduce claim elements. However, the use of such phrases should not be construed to imply that the introduction of a claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an”; the same holds true for the use in the claims of definite articles.

1. A computer-implemented method comprising: loading a first job corresponding to a first partition onto a first hardware thread; loading a second job corresponding to a second partition onto a second hardware thread, wherein the second partition is different than the first partition; and simultaneously executing the first job on a first hardware thread and the second job on a second hardware thread, wherein the first hardware thread and the second hardware thread are co-located on a first processor.
2. The method of claim 1 wherein the first job and the second job share a single translation lookaside buffer located on the first processor.
3. The method of claim 2 wherein the translation lookaside buffer comprises a first set of concatenated virtual addresses and a second set of concatenated virtual addresses, the first set of concatenated virtual addresses corresponding to a first partition identifier associated with the first partition and the second set of concatenated virtual addresses corresponding to a second partition identifier associated with the second partition.
4. The method of claim 3 further comprising: concatenating the first partition identifier with a virtual page identifier, resulting in a first concatenated virtual address that is included in the first set of concatenated virtual addresses; and concatenating the second partition identifier with the virtual page identifier that is the same virtual page identifier that was concatenated with the first partition identifier, resulting in a second concatenated virtual address that is included in the second set of concatenated virtual addresses.
5. The method of claim 4 wherein the first concatenated virtual address translates to a first system-wide real address in the translation lookaside buffer, and wherein the second virtual address translates to a second system-wide real address in the translation lookaside buffer, the first system-wide real address being different from the second system-wide real address throughout a computer system that includes the first processor.
6. The method of claim 1 wherein the first partition corresponds to a first subset of resources included in the processor, and wherein the second partition corresponds to a second subset of resources included in the processor that are different from the first subset of resources.
7. The method of claim 1 wherein the first partition is invoked by the first processor and the second partition is invoked by a second processor.
8. A computer program product stored on a computer operable media, the computer operable media containing instructions for execution by a computer, which, when executed by the computer, cause the computer to implement a method of processing jobs, the method comprising: loading a first job corresponding to a first partition onto a first hardware thread; loading a second job corresponding to a second partition onto a second hardware thread, wherein the second partition is different than the first partition; and simultaneously executing the first job on a first hardware thread and the second job on a second hardware thread, wherein the first hardware thread and the second hardware thread are co-located on a first processor.
9. The computer program product of claim 8 wherein the first job and the second job share a single translation lookaside buffer located on the first processor.
10. The computer program product of claim 9 wherein the translation lookaside buffer comprises a first set of concatenated virtual addresses and a second set of concatenated virtual addresses, the first set of concatenated virtual addresses corresponding to a first partition identifier associated with the first partition and the second set of concatenated virtual addresses corresponding to a second partition identifier associated with the second partition.
11. The computer program product of claim 10 wherein the method further comprises: concatenating the first partition identifier with a virtual page identifier, resulting in a first concatenated virtual address that is included in the first set of concatenated virtual addresses; and concatenating the second partition identifier with the virtual page identifier that is the same virtual page identifier that was concatenated with the first partition identifier, resulting in a second concatenated virtual address that is included in the second set of concatenated virtual addresses.
12. The computer program product of claim 11 wherein the first concatenated virtual address translates to a first system-wide real address in the translation lookaside buffer, and wherein the second virtual address translates to a second system-wide real address in the translation lookaside buffer, the first system-wide real address being different from the second system-wide real address throughout a computer system that includes the first processor.
13. The computer program product of claim 8 wherein the first partition corresponds to a first subset of resources included in the processor, and wherein the second partition corresponds to a second subset of resources included in the processor that are different from the first subset of resources.
14. The computer program product of claim 8 wherein the first partition is invoked by the first processor and the second partition is invoked by a second processor.
15. An information handling system comprising: one or more processors; a memory accessible by the processors; one or more nonvolatile storage devices accessible by the processors; and a set of instructions stored in the memory, wherein one or more of the processors executes the set of instructions in order to perform actions of: loading a first job corresponding to a first partition onto a first hardware thread; loading a second job corresponding to a second partition onto a second hardware thread, wherein the second partition is different than the first partition; and simultaneously executing the first job on a first hardware thread and the second job on a second hardware thread, wherein the first hardware thread and the second hardware thread are co-located on a first processor.
16. The information handling system of claim 15 wherein the first job and the second job share a single translation lookaside buffer located on the first processor.
17. The information handling system of claim 16 wherein the translation lookaside buffer comprises a first set of concatenated virtual addresses and a second set of concatenated virtual addresses, the first set of concatenated virtual addresses corresponding to a first partition identifier associated with the first partition and the second set of concatenated virtual addresses corresponding to a second partition identifier associated with the second partition.
18. The information handling system of claim 17 further comprising an additional set of instructions in order to perform actions of: concatenating the first partition identifier with a virtual page identifier, resulting in a first concatenated virtual address that is included in the first set of concatenated virtual addresses; and concatenating the second partition identifier with the virtual page identifier that is the same virtual page identifier that was concatenated with the first partition identifier, resulting in a second concatenated virtual address that is included in the second set of concatenated virtual addresses.
19. The information handling system of claim 18 wherein the first concatenated virtual address translates to a first system-wide real address in the translation lookaside buffer, and wherein the second virtual address translates to a second system-wide real address in the translation lookaside buffer, the first system-wide real address being different from the second system-wide real address throughout a computer system that includes the first processor.
20. The information handling system of claim 15 wherein the first partition corresponds to a first subset of resources included in the processor, and wherein the second partition corresponds to a second subset of resources included in the processor that are different from the first subset of resources.